17 research outputs found

A Fast and Scalable Graph Coloring Algorithm for Multi-core and Many-core Architectures

Author: D Chakrabarti
H Cougny De
MM Strout
MR Garey
MT Jones
ÜV Çatalyürek
Publication date: 18/05/2015

Irregular computations on unstructured data are an important class of problems for parallel programming. Graph coloring is often an important preprocessing step, e.g. as a way to perform dependency analysis for safe parallel execution. The total run time of a coloring algorithm adds to the overall parallel overhead of the application, whereas the number of colors used determines the amount of exposed parallelism. A fast and scalable coloring algorithm using as few colors as possible is vital for the overall parallel performance and scalability of many irregular applications that depend upon runtime dependency analysis. Catalyurek et al. have proposed a graph coloring algorithm which relies on speculative, local assignment of colors. In this paper we present an improved version which runs even more optimistically, with less thread synchronization and a reduced number of conflicts compared to Catalyurek et al.'s algorithm. We show that the new technique scales better on multi-core and many-core systems and performs up to 1.5x faster than its predecessor on graphs with high-degree vertices, while keeping the number of colors at the same near-optimal levels. Comment: To appear in the proceedings of Euro Par 201

arXiv.org e-Print Archive

Crossref

Spiral - Imperial College Digital Repository

Diagrammes de puissance restreint sur le GPU

Author: Alexander S.
Batista V. H.
De Cougny H. L.
Delage C.
Fei Y.
Gonzalez R. E.
Lien S.
Lévy B.
Mérigot Q.
Nikos C.
Remacle J.-F.
The CGAL Project
Watson D.
Xin S.-Q.
Publication venue: HAL CCSD
Publication date: 03/05/2021

We propose a method to simultaneously decompose a 3D object into power diagram cells and to integrate given functions in each of the obtained simple regions. We offer a novel, highly parallel algorithm that lends itself to an efficient GPU implementation. It is optimized for algorithms that need to compute many decompositions, for instance, centroidal Voronoi tessellation algorithms and incompressible fluid dynamics simulations. We propose an efficient solution that directly evaluates the integrals over every cell without computing the power diagram explicitly and without intersecting it with a tetrahedralization of the domain. Most computations are performed on the fly, without storing the power diagram. We manipulate a triangulation of the boundary of the domain (instead of tetrahedralizing the domain) to speed up the process. Moreover, the cells are treated independently of one another, making it possible to trivially scale up on a parallel architecture. Despite recent Voronoi diagram generation methods optimized for the GPU, computing integrals over restricted power diagrams still poses significant challenges; the restriction to a complex simulation domain is difficult and likely to be slow. It is not trivial to determine when a cell of a power diagram is completely computed, and the resulting integrals (e.g. the weighted Laplacian operator matrix) do not fit into fast (shared) GPU memory. We address all these issues and boost the performance of the state-of-the-art algorithms by a factor of 2 to 3 for (unrestricted) Voronoi diagrams, and achieve a ×50 speed-up with respect to CPU implementations for restricted power diagrams. An essential ingredient to achieve this is our new scheduling strategy that allows us to treat each Voronoi/power diagram cell with optimal settings and to benefit from the fast memory

Crossref

INRIA a CCSD electronic archive server

Parallel Automatic Adaptive Analysis

Author: C. Ozturan
C.L. Bottasso
H. De Cougny
J. Flaherty
M. Shephard
M.L. Simone
Publication date: 01/01/1997

Archivio istituzionale della ricerca - Politecnico di Milano

Scaling Up Multiphysics

Author: C Walshaw
DL Marcum
F Togashi
G Karypis
GE Blelloch
H Cougny de
HL Cougny de
NP Weatherill
PL George
R Löhner
R Said
Publication venue: 'Springer Science and Business Media LLC'

Crossref

A Parallel Navier-Stokes Method and Grid Adapter with Hybrid Prismatic/Tetrahedral Grids

Author: Das R.
de Cougny H. L.
Kallinderis A.
Kallinderis Y.
Parthasarathy V.
Venkatakrishnan V.
Publication date: 01/01/1995

A parallel finite-volume method for the Navier-Stokes equations with adaptive hybrid prismatic/tetrahedral grids is presented and evaluated in terms of parallel performance. The solver is a central type differencing scheme with Lax-Wendroff marching in time. The grid adapter combines directional with isotropic local refinement of the prisms and tetrahedra. The hybrid solver and the grid adapter are both implemented on the Intel Paragon MIMD architecture. Reduction in execution time with increasing number of processors is close to linear. A parallel communication strategy is presented and the resulting communication times remain about the same with an increasing number of processors. Subdivision of the grids into subdomains is based on the co-ordinates of the cell centroids and different partitionings of the hybrid meshes are considered. The execution times for parallel solution of viscous flow around the HSCT configuration with hybrid grids are presented for different grid partit..

CiteSeerX

Crossref

Parallel automatic adaptive analysis, Parallel Comput 23

Author: C. L. Bottasso
C. Ozturan
H. L. De Cougny
J. E. Flaherty
M. L. Simone
M. S. Shephard

CiteSeerX

Parallel Automated Adaptive Procedures for Unstructured Meshes

Author: C. L. Bottasso
C. Ozturan
H. L. De Cougny
J. E. Flaherty
M. S. Shephard
M. W. Beall
Shephard Flaherty

Contents
1. Introduction
2. Parallel Control of Evolving Meshes
2.1 Mesh Data Structure to Support Geometry-Based Automated Adaptive Analysis
2.2 Partition Communication and Mesh Migration
2.2.1 Requirements of PMDB and Related Efforts
2.2.2 Distributed Mesh Model and Notation Used
2.2.3 Data Structures
2.2.4 Mesh Migration
2.2.5 Scalability of Mesh Migration and Extensions
2.3 Dynamic Load Balancing of Adaptively Evolving Meshes
2.3.1 Geometry-Based Dynamic Balancing Procedures
2.3.2 Topologically-Based Dynamic Balancing Procedures
3. Parallel Automatic Mesh Generation
3.1 Introduction
3.2 Background and Meshing Approach
3.3 Sequential Region Meshing
3.3.1 Underlying Octree
3.3.2 Template Meshing of Interior Octants
3.3.3 Face Removal
3.4 Parallel Constructs Required
3.4.1 Octree and Mesh Data Structures
3.4.2 Multiple Octant Migration
3.4.3 Dynamic Repartitioning
3.5 Parallel Region Meshing
3.5.1 Underlying Octree
3.5.2 Template Meshing of Interior Octants
3.5.3 Face Remova

CiteSeerX

Dynamic Parallel Adaption for Three Dimensional Unstructured Meshes: Application to Interface Tracking

Author: A. Basermann
A. Dervieux
A. Tam
F. Alauzet
H. Cougny de
M. Castro-Diaz
O.C. Zienkiewicz
R. Almeida
T. Coupez
W. Huang
Y. Mesri
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

Crossref

A New Approach for Improvement of Polygonal Meshes Representing Surfaces with Sharp Edges and Boundaries

Author: Alboul L. and van Damme, R.
DE COUGNY H L
Dyn N., Hormann, K., Kim, S.-J. an
Freitag L.A.
Frey P. and Borouchaki, H.
Hoppe H., Duchamp, T., McDonald, J
I. SEMENOVA
Ichiro HAGIWARA
Vladimir SAVCHENKO
Publication venue: 'Japan Society of Mechanical Engineers'
Publication date: 01/01/2005

Crossref